Thirty Days of Metal — Day 16: Textures
This series of posts is my attempt to present the Metal graphics programming framework in small, bite-sized chunks for Swift app developers who haven’t done GPU programming before.
If you want to work through this series in order, start here. To download the sample code for this article, go here.
So far, we have only worked with one type of resource: buffers. Although buffers are extremely versatile, sometimes we need a little more structure. In particular, sometimes we want to store images on the GPU. To meet this need, we’ll introduce a new type of resource: textures.
We have already been using textures behind the scenes: the drawables provided by our MTKView wrap color textures, and enabling depth buffering on the view causes it to allocate a depth texture for us. In this article, we will learn how to create our own textures, and how to apply them to objects in our scene to give them more apparent detail.
NOTE: When referring to a pixel in a texture, we sometimes use the more specific term texel to emphasize where it resides. I will only employ this term when it is important to distinguish between pixels and texels; otherwise, I prefer “pixel.”
Textures in Metal
Like a buffer, a texture owns a region of GPU memory. Unlike a buffer, a texture has an innate pixel format, which indicates how the data it contains should be interpreted. For example, the MTLPixelFormat.bgra8Unorm format tells us that each pixel has four components: blue, green, red, and alpha; that each of these components occupies 8 bits; and that each component should be interpreted by taking its 8-bit unsigned integer value (between 0 and 255) and dividing it by 255 to get a normalized value between 0 and 1. That’s a lot of information in a single property.
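To make the normalization rule concrete, here is a small CPU-side sketch of how one bgra8Unorm pixel's bytes map to normalized values. This is plain Swift for illustration only; the names are hypothetical, and on the GPU this conversion happens in hardware when the texture is sampled.

```swift
// Decode one bgra8Unorm pixel (4 bytes: blue, green, red, alpha)
// into normalized Float components by dividing each byte by 255.
func normalize(_ byte: UInt8) -> Float {
    return Float(byte) / 255.0
}

let pixelBytes: [UInt8] = [255, 128, 0, 255] // b, g, r, a
let blue  = normalize(pixelBytes[0]) // 1.0
let green = normalize(pixelBytes[1]) // ≈ 0.502
let red   = normalize(pixelBytes[2]) // 0.0
let alpha = normalize(pixelBytes[3]) // 1.0
```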
In Metal, textures can be one-dimensional, two-dimensional, or three-dimensional. Additionally, a texture can store a cube map, which is a set of six images that are stitched together to form a cube. Finally, a texture can be a texture array of any of the previous types (1D, 2D, 3D, or cube), which means it can effectively store a list of textures as a single resource.
Like buffers, textures can be stored differently depending on how often they will be accessed by the CPU and the GPU. This is determined by the texture’s storage mode. On macOS, a texture with a storage mode of MTLStorageMode.managed can be read and written by the CPU without explicit copies, but explicit synchronization is required. A texture with a storage mode of MTLStorageMode.private can only be read and written by the GPU, so updating its contents from the CPU requires creating a “staging resource,” uploading that resource to the GPU, then using a blit command encoder to copy into the private texture. Finally, a texture with a storage mode of MTLStorageMode.shared on iOS or Macs with Apple Silicon can be read and written by the CPU and GPU without using Metal’s explicit synchronization APIs, although it is still important to pay attention to ordinary synchronization concerns like race conditions.
One more consideration when creating textures is how they will be used. Some textures, like render targets, will be written frequently, but rarely if ever read; their usage flags should include MTLTextureUsage.renderTarget. By contrast, many textures, like those used to perform texture mapping, will be read very frequently and rarely if ever updated; their usage flags should include MTLTextureUsage.shaderRead. By telling Metal how we expect to use our textures, we allow the driver to optimize the texture’s layout and access patterns.
Creating a Texture
To create a texture, we first fill out an MTLTextureDescriptor object. This parameter object has many properties that control the format and layout of the texture.
let textureDescriptor = MTLTextureDescriptor()

Suppose we want to create a texture that we will populate with image data once from the CPU, then sample from repeatedly when rendering. We can start by setting the pixel format:
textureDescriptor.pixelFormat = MTLPixelFormat.bgra8Unorm

Then, we configure the texture type and size by setting the width and height properties. There is also a depth property for setting the depth of 3D textures; this is set to 1 by default.
textureDescriptor.textureType = MTLTextureType.type2D
textureDescriptor.width = 128
textureDescriptor.height = 128

We set its usage flags to .shaderRead:
textureDescriptor.usage = MTLTextureUsage.shaderRead

Finally, we set the texture’s storage mode to .private, since it will be used exclusively by the GPU.
textureDescriptor.storageMode = MTLStorageMode.private

Now that we have our texture descriptor populated, we create the texture by calling the makeTexture(descriptor:) method on the device:
let texture = device.makeTexture(descriptor: textureDescriptor)!

Now the texture is allocated but has no contents. To fill it with image data, we would load an image into memory, copy the data into a buffer, then use a blit command encoder to populate the texture.
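The buffer-and-blit approach just described might be sketched as follows. This is a hedged outline, not a complete implementation: it assumes `device`, `commandQueue`, the 128×128 bgra8Unorm `texture` created above, and an `imageData` array of bytes (4 per pixel) already exist.

```swift
// Upload CPU-side image bytes into a .private texture via a
// shared staging buffer and a blit command encoder.
let bytesPerRow = 128 * 4
let stagingBuffer = device.makeBuffer(bytes: imageData,
                                      length: bytesPerRow * 128,
                                      options: .storageModeShared)!
let commandBuffer = commandQueue.makeCommandBuffer()!
let blitEncoder = commandBuffer.makeBlitCommandEncoder()!
blitEncoder.copy(from: stagingBuffer,
                 sourceOffset: 0,
                 sourceBytesPerRow: bytesPerRow,
                 sourceBytesPerImage: bytesPerRow * 128,
                 sourceSize: MTLSize(width: 128, height: 128, depth: 1),
                 to: texture,
                 destinationSlice: 0,
                 destinationLevel: 0,
                 destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
blitEncoder.endEncoding()
commandBuffer.commit()
```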
Fortunately, it is commonly easier to create and populate textures using a utility class from MetalKit called MTKTextureLoader.
Loading a Texture from an Asset Catalog
Asset catalogs are a powerful way to organize app resources efficiently. You have probably used them to manage your app’s icon assets, and perhaps have used them much more extensively. We will look at how to use asset catalogs and MetalKit together to easily and efficiently create Metal textures at runtime.
To start, add a new Texture Set asset to your app’s asset catalog.
Then, drop an image file onto the image well in the asset’s overview:
You can specify different asset representations that distinguish by traits like device type, scale, and graphics family, but the default Universal representation will suffice for our needs. Make sure to set its “Origin” property to “Bottom Left”; otherwise, it will be loaded into Metal upside-down.
We can load texture assets with an instance of MTKTextureLoader. We create a texture loader by giving it a device with which it can allocate resources and do work on our behalf:
let textureLoader = MTKTextureLoader(device: device)

To specify texture properties that cannot be encoded in an asset catalog, we create a dictionary of MTKTextureLoader.Options. We want to specify the texture’s storage mode and usage flags so Metal knows where to allocate the texture and how it will be used:
let options: [MTKTextureLoader.Option : Any] = [
.textureUsage : MTLTextureUsage.shaderRead.rawValue,
.textureStorageMode : MTLStorageMode.private.rawValue
]

Then, to load the texture, we provide its name and our options dictionary to the loader:
texture = try? textureLoader.newTexture(name: "uv_grid",
scaleFactor: 1.0,
bundle: nil,
options: options)

This operation can throw, so all of the usual caveats about error checking apply; we’re using try? for the sake of simplicity.
Texture Space and Texture Coordinates
In Metal, the origin (0, 0) of a texture is in the upper-left corner, with x increasing to the right and y increasing downward. We specify a location within a texture with an (x, y) pair of texture coordinates. Texture coordinates are normalized, meaning that x and y span from 0 to 1 regardless of the width and height of the texture.
There are various names for these coordinates. For example, you may hear x and y coordinates in texture space called u and v, hence “uv coordinates.” Metal prefers a different convention. In Metal, the horizontal axis is labeled “s,” the vertical axis is labeled “t,” and the depth axis, when used, is labeled “r.”
Texture Mapping
Texture mapping is one of the most common uses of textures. The purpose of texture mapping is to introduce more detail to a rendered surface than can be achieved with per-vertex attributes like vertex colors. Since vertex colors are interpolated by the rasterizer, we don’t have a chance to provide more detailed color information in between vertices. Texture mapping is one solution to this issue.
Texture mapping is done by “unwrapping” a three-dimensional mesh into one or more contiguous two-dimensional “islands” in texture space. This process is illustrated below.
To store this mapping, each vertex includes a two-element vector attribute that holds its texture coordinates. Like other attributes, texture coordinates are interpolated by the rasterizer, so the fragment function receives a pair of smoothly interpolated texture coordinates that can be used to look up a color in the texture.
Generating Texture Coordinates with Model I/O
We can ask Model I/O to generate texture coordinates when creating an MDLMesh. This is as simple as configuring another attribute on the Model I/O vertex descriptor and giving it the name MDLVertexAttributeTextureCoordinate. We also set its offset and buffer index appropriately and update the layout of the buffer itself to account for the new data.
mdlVertexDescriptor.vertexAttributes[2].name =
MDLVertexAttributeTextureCoordinate
mdlVertexDescriptor.vertexAttributes[2].format = .float2
mdlVertexDescriptor.vertexAttributes[2].offset = 24
mdlVertexDescriptor.vertexAttributes[2].bufferIndex = 0
mdlVertexDescriptor.bufferLayouts[0].stride = 32

So we can give each object a different texture if desired, we replace the color property of the Node class with a texture property:
var texture: MTLTexture?

Sampling
Looking up the color of a texel at a set of absolute coordinates is called reading. We might read a texture’s contents when writing a per-pixel image filter, or doing some kind of other operation on a pixel-by-pixel basis.
Because they are normalized and interpolated, texture coordinates often indicate positions “in between” texels, so we can’t just read a single pixel to get the right color of a fragment; we need to look at multiple nearby texels.
We call the process of calculating the color of a texture at a particular set of texture coordinates sampling. It should more accurately be called resampling or reconstruction: textures are already discrete signals, so what we call sampling is really the process of rebuilding the intermediate image data.
Anyway, one of the most common ways to reconstruct a color in between texels is bilinear interpolation. In this scheme, the four texels nearest the sample point are considered. The relative distances of the sample point to the center of each texel along the horizontal axis are found and used as weights to produce an interpolated color for the top and bottom pairs of colors, then the relative distances along the vertical axis are used as weights to average these together to get the final color.
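The scheme just described can be sketched as a small CPU-side Swift function over a grayscale “texture” stored as a 2D array. This is purely illustrative; the names are hypothetical, and it assumes x and y have already been converted from normalized coordinates into continuous texel coordinates.

```swift
// Bilinear interpolation: blend the four texels nearest the sample
// point, first horizontally, then vertically.
func bilinearSample(_ texels: [[Float]], x: Float, y: Float) -> Float {
    let x0 = Int(x), y0 = Int(y)
    let x1 = min(x0 + 1, texels[0].count - 1)
    let y1 = min(y0 + 1, texels.count - 1)
    let fx = x - Float(x0)   // horizontal weight
    let fy = y - Float(y0)   // vertical weight
    // Interpolate the top and bottom pairs of texels horizontally…
    let top    = texels[y0][x0] * (1 - fx) + texels[y0][x1] * fx
    let bottom = texels[y1][x0] * (1 - fx) + texels[y1][x1] * fx
    // …then blend those two results vertically.
    return top * (1 - fy) + bottom * fy
}

let texels: [[Float]] = [[0, 1],
                         [1, 0]]
let center = bilinearSample(texels, x: 0.5, y: 0.5) // 0.5
```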
Bilinear interpolation is common, but it is not the only possible sampling scheme. We can also ask Metal to round the texture coordinates to the nearest texel center.
These options are available in Metal in the MTLSamplerMinMagFilter enumeration:
enum MTLSamplerMinMagFilter : UInt {
case nearest
case linear
}

We will see below where we can use this enumeration to control sampling.
Address Modes
So far, we have only considered texture coordinates inside the normalized range of 0 to 1. What happens if we sample outside this range? We can choose among several address modes to control Metal’s behavior in this case.
Perhaps the most common address mode is repeat. This causes the texture to “tile” in texture space, wrapping texture coordinates greater than 1 back around to 0. Another useful option is clamp. This forces coordinates less than 0 to 0 and coordinates greater than 1 to 1, causing just the edge texels to repeat outside the texture’s bounds. There are also a few other less common modes; you can consult the Metal documentation for the MTLSamplerAddressMode enumeration for these.
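The two modes boil down to simple per-coordinate arithmetic, which we can sketch on the CPU. This is an approximation for illustration (the names are made up, and clamping on real hardware pins to the edge texels’ centers rather than exactly 0 and 1):

```swift
// "repeat": wrap the coordinate into [0, 1) by discarding the
// integer part. (`repeat` is a Swift keyword, hence the name.)
func repeatMode(_ coord: Float) -> Float {
    return coord - coord.rounded(.down)
}

// "clamp to edge": pin the coordinate into [0, 1].
func clampToEdge(_ coord: Float) -> Float {
    return min(max(coord, 0), 1)
}

repeatMode(1.25)   // 0.25
clampToEdge(1.25)  // 1.0
clampToEdge(-0.5)  // 0.0
```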
enum MTLSamplerAddressMode : UInt {
case clampToEdge = 0
case `repeat` = 2
//…other modes…
}

Sampler States
We control how textures are sampled by creating sampler state objects. Similar to a depth-stencil state, a sampler state contains numerous properties that we can set all at once by binding the sampler on a render command encoder before drawing.
To create a sampler state, we first configure an MTLSamplerDescriptor object.
let samplerDescriptor = MTLSamplerDescriptor()

We can select between normalized and absolute coordinates with the normalizedCoordinates property. Most often, we want to use normalized coordinates, so this property is true by default.
samplerDescriptor.normalizedCoordinates = true

Next, we need to choose which filter mode to use when magnifying and minifying. Magnification happens when there are fewer texels than pixels in a region; minification happens when there are more texels than pixels. We saw the MTLSamplerMinMagFilter above; this is where we use it. The .linear case enables bilinear interpolation.
samplerDescriptor.magFilter = .linear
samplerDescriptor.minFilter = .linear

We also select an address mode for each axis of texture space. Since we are working with 2D textures for now, we set the sAddressMode and tAddressMode to repeat:
samplerDescriptor.sAddressMode = .repeat
samplerDescriptor.tAddressMode = .repeat

We can now make a sampler state by calling the makeSamplerState(descriptor:) method on a device:
samplerState = device.makeSamplerState(descriptor: samplerDescriptor)!

Using Samplers and Textures in Shaders
We will update our vertex structures once again to match the set of attributes we are using in our vertex descriptor:
struct VertexIn {
float3 position [[attribute(0)]];
float3 normal [[attribute(1)]];
float2 texCoords [[attribute(2)]];
};

struct VertexOut {
float4 position [[position]];
float3 normal;
float2 texCoords;
};
To pass the texture coordinates through to our fragment function, we make a small update to our vertex function:
out.texCoords = in.texCoords;

The biggest updates are to the fragment function. We need to take a sampler object and a texture object so we can sample the texture at each fragment to determine its color.
Here is the updated fragment function signature:
fragment float4 fragment_main(
VertexOut in [[stage_in]],
texture2d<float, access::sample> textureMap [[texture(0)]],
sampler textureSampler [[sampler(0)]])

Note the somewhat unusual syntax texture2d&lt;float, access::sample&gt;. The texture2d type is a template, which is akin to a Swift generic. Its template parameters indicate the type of component we want to receive (float or half) and how we will use it (access::read, access::write, or access::sample). texture(0) indicates that we will bind the texture at fragment texture slot 0.
Buffers, textures, and samplers have separate binding index sets, so sampler(0) indicates that we will bind the sampler at fragment sampler slot 0. This doesn’t conflict with texture slot 0 because of the difference in type.
To sample a texture, we use the sample method, passing our sampler and texture coordinates. We always get back a 4-element vector, regardless of how many components the source texture has.
float4 color = textureMap.sample(textureSampler, in.texCoords);
return color;

Binding Samplers and Textures
Like buffers, we bind samplers and textures on a render command encoder before drawing to use them in our shader functions. We haven’t bound resources for use in a fragment shader before, but since we want to sample the texture for each fragment, we will do that now. To bind a sampler, we use the setFragmentSamplerState(_:index:) method, and to bind a texture, we use the setFragmentTexture(_:index:) method.
We bind each object at the slot we chose when updating our fragment function above:
renderCommandEncoder.setFragmentTexture(node.texture, index: 0)
renderCommandEncoder.setFragmentSamplerState(samplerState, index: 0)
We can update our sample app to draw a couple of different shapes to see how Model I/O parameterizes these shapes in texture space.
Soon, we will look at how to use Model I/O to load more complex 3D models, at which point we will unlock a new level of realism.